Scheduling distributed multiway spatial join queries: optimization models and algorithms
Abstract
Multiway spatial joins are a commonly occurring and fundamental type of query for spatial data processing. This article presents models and algorithms to schedule this type of query in distributed database systems while attempting to strike a balance between makespan and communication costs. We propose three algorithms based on combinatorial optimization methods: the well-known linear relaxation technique of rounding the solution generated by linear programming (LP), the more sophisticated Lagrangian Relaxation method (LR), as well as a greedy heuristic (GR) as a baseline for comparison. Our evaluation shows that a schedule built using GR consumes, on average, 22% more processing resources than the more elaborate schedule constructed via the LR method, when scheduling for 64 machines. The schedule provided by LR is also, on average, an order of magnitude closer to the optimal schedule when compared to GR. We also show that scheduling Gigabyte-size multiway spatial join queries before execution can reduce its execution time when compared to state-of-the-art frameworks that do not have this capability, and significantly reduce the amount of data shuffled on the network.
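To make the makespan versus communication trade-off concrete, the following is a minimal sketch of a greedy scheduler in Python. It is not the article's GR algorithm, whose details the abstract does not give. The `JoinTask` model, the `alpha` weight, and the longest-task-first ordering are all assumptions introduced here for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JoinTask:
    """One unit of join work (hypothetical model, not taken from the article)."""
    name: str
    cpu_cost: float            # estimated processing cost of the task
    data_on: dict[int, float]  # machine id -> amount of input already stored there


def greedy_schedule(tasks, n_machines, alpha=0.5):
    """Assign each task to the machine with the lowest weighted cost.

    alpha weights the makespan term; (1 - alpha) weights the amount of
    input that would have to be shipped to the chosen machine.
    Both weights and the cost formula are illustrative assumptions.
    """
    load = [0.0] * n_machines
    assignment = {}
    shipped = 0.0
    # Place expensive tasks first (longest-processing-time-first ordering).
    for task in sorted(tasks, key=lambda t: t.cpu_cost, reverse=True):
        total_input = sum(task.data_on.values())

        def cost(m):
            # Makespan if the task ran on machine m, plus data moved to m.
            makespan_after = max(max(load), load[m] + task.cpu_cost)
            transfer = total_input - task.data_on.get(m, 0.0)
            return alpha * makespan_after + (1 - alpha) * transfer

        best = min(range(n_machines), key=cost)
        load[best] += task.cpu_cost
        shipped += total_input - task.data_on.get(best, 0.0)
        assignment[task.name] = best
    return assignment, max(load), shipped


tasks = [
    JoinTask("t1", 4.0, {0: 10.0}),
    JoinTask("t2", 3.0, {1: 10.0}),
    JoinTask("t3", 2.0, {0: 5.0, 1: 5.0}),
]
assignment, makespan, shipped = greedy_schedule(tasks, n_machines=2)
print(assignment, makespan, shipped)
```

In this sketch the two cost terms use different units (processing cost and data volume), so `alpha` would need to be calibrated for any real workload. A greedy pass like this never revisits earlier placements, which is consistent with the abstract's use of a greedy heuristic as a baseline rather than as the preferred method.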
Similar resources
Cost Models for Join Queries in Spatial Databases
The join query is one of the fundamental operations in Data Base Management Systems (DBMSs). Modern DBMSs should be able to support non-traditional data, including spatial objects, in an efficient manner. Towards this goal, spatial data structures can be adopted in order to support the execution of join queries on sets of multidimensional data. This paper introduces analytical models that estim...
An Effective High-Performance Multiway Spatial Join Algorithm with Spark
Multiway spatial join plays an important role in GIS (Geographic Information Systems) and their applications. With the increase in spatial data volumes, the performance of multiway spatial join has encountered a computation bottleneck in the context of big data. Parallel or distributed computing platforms, such as MapReduce and Spark, are promising for resolving the intensive computing issue. P...
JTop Algorithms for Top-k Join Queries
Top-k join queries have become very important in many important areas of computing. One of the most efficient algorithms for top-k join queries is the Rank-Join algorithm [17] [18]. However, there are many cases where Rank-Join does much unnecessary access to the input data sources. In this report, we first show that there are many cases where Rank-Join's stopping mechanism is not efficient, an...
Search algorithms for multiway spatial joins
This paper deals with multiway spatial joins when (i) there is limited time for query processing and the goal is to retrieve the best possible solutions within this limit, or (ii) there is unlimited time and the goal is to retrieve a single exact solution, if such a solution exists, or the best approximate one otherwise. The first case is motivated by the high cost of join processing in real-time ...
Data-Parallel Spatial Join Algorithms
Efficient data-parallel spatial join algorithms for PMR quadtrees and R-trees, common spatial data structures, are presented. The domain consists of planar line segment data (i.e., Bureau of the Census TIGER/Line files). Parallel algorithms for map intersection and a spatial range query are described. The algorithms are implemented using the SAM (Scan-And-Monotonic-mapping) model of parallel computa...
Journal
Journal title: International Journal of Geographical Information Science
Year: 2023
ISSN: 1365-8824, 1365-8816
DOI: https://doi.org/10.1080/13658816.2023.2170380